Journal article

On the efficiency of k-means clustering: Evaluation, optimization, and algorithm selection

S Wang, Y Sun, Z Bao

Proceedings of the VLDB Endowment | ASSOC COMPUTING MACHINERY | Published: 2020

Abstract

This paper presents a thorough evaluation of the existing methods that accelerate Lloyd’s algorithm for fast k-means clustering. To do so, we analyze the pruning mechanisms of existing methods, and summarize their common pipeline into a unified evaluation framework UniK. UniK embraces a class of well-known methods and enables a fine-grained performance breakdown. Within UniK, we thoroughly evaluate the pros and cons of existing methods using multiple performance metrics on a number of datasets. Furthermore, we derive an optimized algorithm over UniK, which effectively hybridizes multiple existing methods for more aggressive pruning. To take this further, we investigate whether the most efficient m..
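As a rough illustration of the kind of pruning the abstract refers to, the sketch below runs Lloyd's algorithm with one well-known triangle-inequality test (the inter-centroid bound used in Elkan-style methods). It is not UniK or the paper's optimized algorithm; the function name `lloyd_with_pruning`, its parameters, and the `skipped` counter are hypothetical names chosen for this example.

```python
import math
import random


def lloyd_with_pruning(points, k, iters=20, seed=0):
    """Lloyd's k-means with an inter-centroid triangle-inequality prune.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    Returns (centroids, labels, skipped), where `skipped` counts the
    point-to-centroid distance computations avoided by the bound.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    # Initial assignment by brute force: nearest of the k seed centroids.
    labels = [min(range(k), key=lambda j: math.dist(x, centroids[j]))
              for x in points]
    skipped = 0
    for _ in range(iters):
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
        # Pairwise centroid distances, computed once per iteration.
        cc = [[math.dist(a, b) for b in centroids] for a in centroids]
        changed = False
        for i, x in enumerate(points):
            best = labels[i]
            best_d = math.dist(x, centroids[best])
            for j in range(k):
                if j == best:
                    continue
                # If d(c_best, c_j) >= 2 * d(x, c_best), the triangle
                # inequality gives d(x, c_j) >= d(x, c_best): skip c_j.
                if cc[best][j] >= 2 * best_d:
                    skipped += 1
                    continue
                d = math.dist(x, centroids[j])
                if d < best_d:
                    best, best_d = j, d
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return centroids, labels, skipped


# Usage on two well-separated toy groups (made-up data).
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels, skipped = lloyd_with_pruning(toy, k=2)
```

The prune never changes the assignment that plain Lloyd's would produce; it only avoids distance computations whose outcome is already determined, which is why such bounds can be compared purely on running cost.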


University of Melbourne Researchers